{ "cells": [ { "cell_type": "markdown", "id": "87c7431e-a51a-4381-af1d-7eb3fc5c626b", "metadata": {}, "source": [ "# Geospatial mapping with Folium" ] }, { "cell_type": "markdown", "id": "6cf916c4-c25c-43aa-b952-e68e454ec599", "metadata": {}, "source": [ "Here we will use Pyinaturalist and Folium to create interactive HTML maps that plot iNaturalist observations. In this example we will create a simple map in which clicking an observation opens a pop-up with a thumbnail image, basic observation information, and a link to the observation page on iNaturalist.\n", "\n", "[Folium](https://python-visualization.github.io/folium/latest/index.html) is a powerful mapping library that takes care of background details such as downloading and stitching map tiles together. It also lets you choose from several different base map tile providers.\n", "\n", "The first thing to do is import all the necessary libraries. The non-standard libraries used in this notebook are Folium, [Dateutil](https://dateutil.readthedocs.io/en/stable/), [Pandas](https://pandas.pydata.org/), and of course Pyinaturalist, so you will have to install them using `pip` or `conda`, depending on your Python installation." ] }, { "cell_type": "code", "execution_count": null, "id": "41b12373-8423-46f7-b475-40b37c018b75", "metadata": {}, "outputs": [], "source": [ "import json\n", "from os.path import exists\n", "\n", "import folium\n", "import pandas as pd\n", "from dateutil import parser, tz\n", "from folium.plugins import HeatMap\n", "\n", "from pyinaturalist import (\n", "    iNatClient,\n", "    pprint,\n", ")\n", "\n", "# Create a client for API requests\n", "client = iNatClient()" ] }, { "cell_type": "markdown", "id": "f05ab07d-0f29-4676-b479-21e47bceaed9", "metadata": {}, "source": [ "### Get location and taxon IDs, and edit customizations\n", "\n", "For this example, we will create a map of all bee observations in the town of Osoyoos, BC. 
You are free to run the code as is, or you can choose your own favorite taxa and location and produce a map for those instead. If so, the annotated source code explains where to make changes.\n", "\n", "In order to get observations for a certain geographic area, you need to know how iNaturalist encodes that area in terms of a `place_id`. All geographic places, small or large, are assigned a unique number, and this number can be [non-trivial to find](https://forum.inaturalist.org/t/is-there-a-place-where-i-can-go-to-get-a-list-of-inaturalist-place-ids/4016). Luckily, iNat (and pyinaturalist) provide an API for searching by plain-text place name. " ] }, { "cell_type": "code", "execution_count": null, "id": "aa232b73-fb47-42c6-91a1-11c57160d114", "metadata": {}, "outputs": [], "source": [ "# Search for a place ID by name\n", "places = client.places.autocomplete(q='Osoyoos').all()\n", "pprint(places)" ] }, { "cell_type": "markdown", "id": "042b9a7c-0f43-48fb-9000-623f2e997404", "metadata": {}, "source": "If you already know the taxon ID, scientific name, or common name of the taxon (or taxa) you are interested in, you can pass these directly to the `taxon_name=` or `taxon_id=` arguments in the call to `client.observations.search()`. Otherwise, there is a similar API for retrieving taxon IDs by searching on common or scientific names. You can use the optional `rank=` argument to limit responses to specific taxon ranks. " }, { "cell_type": "code", "execution_count": null, "id": "6bd24179-0150-449a-a3c5-f669895135c3", "metadata": {}, "outputs": [], "source": [ "taxa = client.taxa.autocomplete(q='bees', rank=['family', 'epifamily']).all()\n", "pprint(taxa)" ] }, { "cell_type": "markdown", "id": "8b99c918-6a0e-4b19-9bd2-a0b0a300e208", "metadata": {}, "source": [ "From the above two calls we can see that the `place_id` for Osoyoos is `121320`, and the `taxon_id` for all bees is `630955` (epifamily Anthophila). Let's assign those values to some variables. 
Note that if you want to include multiple taxa, you can combine their IDs in a list and assign that list to `TAXON_ID`. We'll also create a variable named `DATASET_NAME`, which will be the base name for both the JSON file we save locally and the HTML file for the completed map." ] }, { "cell_type": "code", "execution_count": 4, "id": "ec05b635-a815-4f6d-b0ea-0be0135b5282", "metadata": {}, "outputs": [], "source": [ "# Change to something appropriate\n", "DATASET_NAME = 'osoyoos_bees'\n", "\n", "DATASET_FILENAME = f'{DATASET_NAME}.json'\n", "DATASET_MAPNAME = f'{DATASET_NAME}.html'\n", "\n", "# Place ID (choose from the results of the client.places.autocomplete() call above)\n", "PLACE_ID = 121320\n", "\n", "# Use either taxon name or taxon ID\n", "# If you are interested in multiple taxa, you can add them in a list:\n", "# TAXON_ID = [6933, 558438]\n", "TAXON_NAME = 'Epifamily Anthophila'  # Bees!\n", "TAXON_ID = 630955  # Bees!" ] }, { "cell_type": "markdown", "id": "2081b2bc-20c0-45a3-9453-4af1c72a2800", "metadata": {}, "source": [ "### Get the data from iNat" ] }, { "cell_type": "markdown", "id": "beb927b8-6466-454e-a7a5-abea37928052", "metadata": {}, "source": [ "Now we will get the actual observation data using Pyinaturalist. Keep in mind that this call places real demands on the underlying iNaturalist server infrastructure, so it is best practice not to make the query excessively broad. For example, getting all mallard observations in Canada would be a ridiculously large request. When experimenting with this code, try to create a query with limited geographic scope, specialized and rarer taxa, or both. \n", "\n", "To further save on network resources, the following code will only make the API request if there is no local file named `osoyoos_bees.json` (or a file named after whatever value you assigned to `DATASET_NAME`). 
This way, rerunning the notebook reloads the observations from the locally saved copy rather than making another API request." ] }, { "cell_type": "code", "execution_count": null, "id": "1561c973-2dfd-4022-bf66-62c58b3e19b6", "metadata": {}, "outputs": [], "source": [ "if not exists(DATASET_FILENAME):\n", "    print('Making API call...')\n", "\n", "    # The API call. See pyinaturalist documentation for additional arguments\n", "    # you can pass to this call. For example, you can filter based on temporal\n", "    # values, research grade, observations that belong to individual iNat projects,\n", "    # specific users, specific identifiers, and many, many more.\n", "    observations = client.observations.search(\n", "        # taxon_name=TAXON_NAME,\n", "        taxon_id=TAXON_ID,\n", "        photos=True,\n", "        geo=True,\n", "        geoprivacy='open',\n", "        place_id=PLACE_ID,\n", "    ).all()\n", "\n", "    # Save results for future usage (convert to JSON format)\n", "    observations_json = {'results': [obs.to_dict() for obs in observations]}\n", "    with open(DATASET_FILENAME, 'w') as f:\n", "        json.dump(observations_json, f, indent=4, sort_keys=True, default=str)\n", "else:\n", "    print('Using local copy of data...')" ] }, { "cell_type": "markdown", "id": "c45e4ad0-6dfa-49dd-9f81-2b9bad75c999", "metadata": {}, "source": [ "### Read data into a Pandas DataFrame, and extract needed features" ] }, { "cell_type": "markdown", "id": "43d93f51-9d60-41c1-90c5-b3e52ceb7ffb", "metadata": {}, "source": [ "In this cell we load the JSON response from disk and flatten it into a format that can be used to make a Pandas DataFrame. As iNaturalist observations are recorded in UTC, we normalize date values to the local timezone, then extract the observation coordinates for use on the map. Nothing in this cell needs to be customized; it should run as-is for any set of observations." 
] }, { "cell_type": "code", "execution_count": 6, "id": "71569fc0-54e2-404e-a3b5-ca4aff506905", "metadata": {}, "outputs": [], "source": [ "# Read the local JSON data\n", "with open(DATASET_FILENAME) as f:\n", "    d = json.load(f)\n", "\n", "# Flatten nested JSON value, and load data into a Pandas DataFrame\n", "df = pd.json_normalize(d['results'])\n", "\n", "# Normalize timezones\n", "local_tz = tz.tzlocal()\n", "\n", "\n", "def to_local_tz_from_str(s):\n", "    try:\n", "        dt = parser.isoparse(s)\n", "        return dt.astimezone(local_tz)\n", "    except (TypeError, ValueError):\n", "        return None\n", "\n", "\n", "df['observed_on_local'] = df['observed_on'].apply(to_local_tz_from_str)\n", "\n", "# Extract lat/long (GeoJSON coordinates are ordered [longitude, latitude])\n", "df[['lon', 'lat']] = pd.DataFrame(df['geojson.coordinates'].to_list(), index=df.index)" ] }, { "cell_type": "markdown", "id": "fc2448bd-a271-422a-8686-79d9f2c17d5f", "metadata": {}, "source": [ "### Create the interactive HTML map with data plotted\n", "\n", "And now the code for the actual map. Folium requires a set of coordinates to center the map tiles on, so we calculate this using the maxima and minima of the lat/long data itself. In this example, we create two base map layers. The first uses the CartoDB Positron map tiles, and the second uses OpenTopoMap tiles. I am a big fan of the OpenTopoMap tiles for naturalist data, as they do a great job of displaying local trails through parks and other natural areas, and of course, the topographic lines have their own intrinsic value.\n", "\n", "After setting up the base layers, we loop through the observations and add a marker for each one, including the HTML code for the pop-ups. We also add a heatmap overlay, which can be toggled on and off. This can be useful for visualizing observation hot spots. We then fit the map to a bounding box, again computed from the bounds of the observation data, to get a reasonable initial zoom level for the map. 
We include a widget to select between the base layers, then save the map to a local HTML file." ] }, { "cell_type": "code", "execution_count": 7, "id": "d766c9b9-0a06-4fb3-b75e-969fb45e18c2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Map saved to osoyoos_bees.html\n" ] } ], "source": [ "# -----------------------------------------------\n", "# Required: DataFrame with 'lat', 'lon', and iNat data already parsed\n", "# Example assumes these fields are available:\n", "# - 'taxon.name' (scientific)\n", "# - 'taxon.preferred_common_name' (common name)\n", "# - 'observed_on_local' (datetime object)\n", "# - 'user.login' (observer username)\n", "# - 'uri' (link to the iNaturalist observation)\n", "# -----------------------------------------------\n", "\n", "# Create the base map centered on your area of interest\n", "# Get the bounding box of your data\n", "south = df['lat'].min()\n", "north = df['lat'].max()\n", "west = df['lon'].min()\n", "east = df['lon'].max()\n", "\n", "# Calculate rough center of data\n", "center_lat = (south + north) / 2\n", "center_lon = (west + east) / 2\n", "\n", "m = folium.Map(\n", "    location=[center_lat, center_lon],\n", "    zoom_start=13,\n", "    tiles=None,  # We'll add a custom tile layer next\n", ")\n", "\n", "# Add CartoDB basemap\n", "folium.TileLayer(tiles='CartoDB Positron', name='Carto DB').add_to(m)\n", "\n", "# Add OpenTopoMap basemap\n", "folium.TileLayer(\n", "    tiles='https://{s}.tile.opentopomap.org/{z}/{x}/{y}.png',\n", "    attr='Map data © OpenStreetMap, SRTM | Map style © OpenTopoMap (CC-BY-SA)',\n", "    name='OpenTopoMap',\n", ").add_to(m)\n", "\n", "\n", "# Format the datetime as a nice string for popups\n", "def format_datetime(dt):\n", "    try:\n", "        return dt.strftime('%a, %B %d, %Y, %I:%M %p')\n", "    except Exception:\n", "        return 'Unknown date/time'\n", "\n", "\n", "# Loop through observations and add a marker for each one\n", "for _, row in df.iterrows():\n", "    lat = row['lat']\n", "    lon = row['lon']\n", "\n", "    scientific = row.get('taxon.name', 'Unknown')\n", "    common = row.get('taxon.preferred_common_name', 'Unknown')\n", "    observer = row.get('user.login', 'Unknown')\n", "    observed = format_datetime(row.get('observed_on_local'))\n", "    link = row.get('uri', '#')\n", "    # This is the iNaturalist default taxon photo\n", "    img_url = row.get('taxon.default_photo.square_url', '')\n", "    # To use the actual observation photo, comment out the above line,\n", "    # and uncomment the line below.\n", "    # img_url = row['observation_photos'][0]['photo']['url']\n", "\n", "    # Construct HTML popup content: a thumbnail that links to the observation page,\n", "    # followed by the basic observation details\n", "    popup_html = f\"\"\"\n", "    <a href=\"{link}\" target=\"_blank\"><img src=\"{img_url}\"></a><br>\n", "    {common} ({scientific})<br>\n", "    Observed on: {observed}<br>\n", "    By: {observer}\n", "    \"\"\"\n", "\n", "    # Add marker\n", "    folium.CircleMarker(\n", "        location=[lat, lon],\n", "        radius=2,\n", "        color='black',\n", "        fill=True,\n", "        fill_opacity=0.8,\n", "        popup=folium.Popup(popup_html, max_width=300),\n", "    ).add_to(m)\n", "\n", "# Auto-fit the map bounds to the data\n", "# This will ensure the zoom-level of the initial map is reasonable\n", "m.fit_bounds([[south, west], [north, east]])\n", "\n", "# Add a toggleable heat map overlay\n", "heat_data = df[['lat', 'lon']].values.tolist()\n", "HeatMap(heat_data, radius=20, name='heatmap').add_to(m)\n", "\n", "# Add layer control (useful for multiple basemaps or overlays)\n", "folium.LayerControl().add_to(m)\n", "\n", "# Save the map to HTML\n", "m.save(DATASET_MAPNAME)\n", "print(f'Map saved to {DATASET_MAPNAME}')" ] }, { "cell_type": "markdown", "id": "8c53bdf8-93d5-40fc-b22f-ac94f5511fab", "metadata": {}, "source": [ "The map has been saved to an HTML file, which can be opened in any web browser. It is also possible to render the map directly inside the notebook by evaluating the map object as the last expression in a cell: " ] }, { "cell_type": "code", "execution_count": 8, "id": "ee381a3a-3196-4abc-aa64-1553ed409bc2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "
Make this Notebook Trusted to load map: File -> Trust Notebook
" ], "text/plain": [ "\u001b[1m<\u001b[0m\u001b[1;95mfolium.folium.Map\u001b[0m\u001b[39m object at \u001b[0m\u001b[1;36m0x11c4439b0\u001b[0m\u001b[1m>\u001b[0m" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m" ] }, { "cell_type": "markdown", "id": "0d91cbc8-eead-4247-898d-c10630ca0e42", "metadata": {}, "source": [ "There is a great deal you can do to customize the look and feel of maps created with Folium, from the underlying baselayer map tiles, to the size and shape of the map markers themselves, and the information shown in the pop-ups. Consult the Folium documentation for more information on customizations. The map created here is just the beginning of what you can do." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.1" } }, "nbformat": 4, "nbformat_minor": 5 }